ABSTRACT


In this thesis, survival analysis is used to study US Coast Guard enlisted
attrition behavior in terms of individual personnel characteristics such as sex, marital
status, race, paygrade and rating. Results obtained based on 8 years of historical data
from FY83 to FY90 are as follows: males and married individuals have higher survival
probabilities than their counterparts, respectively; paygrades E-1 to E-5 have higher
attrition than paygrades E-6 to E-9; American Indians have the highest attrition and
Asian members have the highest survival probabilities; rating 170 (Gunner's Mate) has
the highest attrition over all ratings, followed by rating 180 (Fire Control Technician); the
rating with the highest survival probability is 570 (Aviation Machinist's Mate); a
decreasing trend in attrition was found during the last 4 years of the observation period.
It was also observed that there was significantly high attrition at the end of the four-year
service contract and when the enlisted member reaches twenty years of service.

Additionally, this thesis provides a regression model to predict monthly
enlisted attrition figures. The significant predictors selected are the prior month's attrition,
the number of enlistments four years prior and the current unemployment rate. The
selected regression model explains almost 97% of the total variation of monthly attrition.
It turns out to perform better than the current method used by the CG.
Executive Summary


One of the responsibilities of the US Coast Guard Personnel
Workforce Planning Office is to forecast enlisted monthly
attrition. In order to forecast enlisted monthly attrition,
it is necessary to analyze attrition behavior. The current
method to forecast enlisted attrition is based
on the mean of the past eight years' attrition figures for the
corresponding month. This method facilitates simple
implementation, but it does not utilize other useful
information such as the economy, attrition behavior and individual
characteristics associated with a specific kind of enlisted
member.

In this thesis, first, survival analysis was applied to
investigate the attrition behavior of US Coast Guard
enlisted members in terms of individual characteristics such as
sex, marital status, race, paygrade and rating. Results
obtained based on the past eight years of data, from October
1982 to September 1990, are as follows: males and married
individuals have higher survival probabilities than their
counterparts, respectively; paygrades E-1 to E-5 have higher
attrition than paygrades E-6 to E-9; American Indians have
the highest attrition and Asian members the highest
survival probabilities; rating 170 (Gunner's Mate) has the
highest attrition over all ratings, followed by rating 180
(Fire Control Technician); the rating with the highest
survival probability is 570 (Aviation Machinist's Mate);
there was a decreasing trend in attrition in the last 4 years
of the observation period. The common feature of the survival
functions was the significant fall at the end of the
four-year service contract and when the enlisted member reaches
twenty years of service.

Secondly, based on the observations made in the survival
analysis, a candidate set of predictors was selected to fit a
regression model for enlisted monthly attrition. The model
finally selected contains the prior month's attrition, the
number of enlistments four years prior and the current
unemployment rate as significant predictors. The selected
regression model explains almost 97% of the total variation of
monthly attrition. The performance of the selected model is
better than that of the current method used by the US Coast Guard
in terms of the mean squared error and the mean relative error.


I. INTRODUCTION


A. PROBLEM DESCRIPTION

One of the responsibilities of the United States Coast
Guard (CG) Personnel Workforce Planning (PWP) Office, located
in Washington D.C., is to forecast personnel stocks, promotion
requirements, monthly attrition and recruitment needs. The
current method to forecast the number of enlisted attrition by
paygrade per month is based on the mean of the previous
eight years' attrition figures for the corresponding month.
This method facilitates simple implementation. However, it
does not utilize other useful information such as attrition
behavior and individual characteristics associated with a
specific kind of enlisted member. The CG Office of Personnel
and Training maintains enlisted personnel records in addition
to information concerning the main reasons for enlisted
attrition. Recently a study was written [Ref. 1]
examining enlisted attrition behavior and developing a model
that projects these attrition figures. The study lacked
temporal stability, as it used personnel data for only one
year (FY91). The conclusions and modelling results were
consequently not very reliable.
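
The current CG method described above can be illustrated with a minimal Python sketch. It assumes the forecast for a month is simply the arithmetic mean of the attrition counts observed for that same calendar month in up to eight preceding years. The function name and the sample counts are hypothetical and are not taken from CG data.

```python
def baseline_forecast(past_counts, window=8):
    """Forecast attrition for a month as the mean of the counts observed
    for the same calendar month in the most recent `window` years.

    past_counts: counts for that month, oldest first (hypothetical data).
    """
    recent = past_counts[-window:]
    if not recent:
        raise ValueError("at least one past observation is required")
    return sum(recent) / len(recent)


# Hypothetical October counts for eight consecutive fiscal years.
october_counts = [310, 295, 330, 305, 320, 300, 290, 310]
october_forecast = baseline_forecast(october_counts)
```

Because every past year receives the same weight, a forecast of this kind cannot react to recent changes in the economy or in the size of the cohorts reaching the end of a contract.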


B. OBJECTIVES

This thesis uses eight years of historical data from FY83
through FY90 to obtain the survival functions for
individual personnel characteristics for survival analysis.
Additionally, it applies regression models to predict monthly
attrition figures and compares the performance of the
resulting prediction model to those previously in use.


C. PERSONNEL BACKGROUND

A summary from USCG LT Douglas Allen Blakemore's thesis is
used to describe the background of the Coast Guard
enlisted personnel structure.

The Coast Guard enlisted personnel structure is composed
of nine paygrades from E-1 to E-9. Paygrade E-1 contains
recruits attending Coast Guard basic training school; paygrade
E-2 is composed of enlisted personnel who have completed basic
training and have been assigned to active duty commands in
preparation for attending a CG specialty school. Paygrades E-3
through E-9 are divided into 24 active duty military
occupational skills (MOS) or subspecialties. Of these, only
22 are of concern, since one (Musician) has an extremely small
attrition rate and the other (Sonar Technician) no longer
exists. The following list contains each MOS, its respective
abbreviation for the subspecialty name and the rating code. It
will be used throughout this thesis to calculate survival
probabilities.


MOS Subspecialty                  Rating        Code

Aviation Machinist's Mate         AD            570
Aviation Electrician's Mate       AE            560
Aviation Damage Controlman        [illegible]   [illegible]
Aviation Survivalman              ASM           530
Aviation Electronic Technician    AT            [illegible]
Boatswain's Mate                  BM            100
Damage Controlman                 DC            [illegible]
Electrician's Mate                EM            270
Electronics Technician            ET            240
Fire Control Technician           FT            180
Gunner's Mate                     GM            170
Health Service Technician         HS            870
Machinery Technician              MK            200
Marine Science Technician         MST           [illegible]
Public Affairs Specialist         PA            [illegible]
Quartermaster                     QM            [illegible]
Radarman                          RD            [illegible]
Radioman                          RM            [illegible]
Storekeeper                       SK            [illegible]
Subsistence Specialist            SS            [illegible]
Telephone Technician              TT            280
Yeoman                            YN            [illegible]

An enlisted member (EM) receives an MOS after completing a
specialty school or completing an on-the-job
training program. Both methods are usually administered while
the EM is in the E-3 paygrade. An E-3 who has not obtained an
MOS is called non-rated; E-3's through E-9's hold only one MOS
at a time and usually maintain that MOS throughout their
careers.

Promotion to the next highest paygrade is determined by
MOS and dictated by the needs of the CG. Promotions to
paygrades E-5 through E-9 are vacancy driven, while those to
paygrades E-3 and E-4 occur on qualification for advancement.
Promotion to paygrade E-2 occurs upon completion of
basic training.

Enlisted personnel sign service obligation contracts
(enlistments) that require the individual to serve in the CG
for a pre-determined number of years, usually four years. (In
the past, there have been two-year enlistments but these have
been terminated.) Upon completion of an enlistment and upon
approval of the CG, a person may sign a new contract
(re-enlist), or separate from the CG. Enlisted attrition generally
occurs due to

• Retirement - After 20 years of active duty service.

• Non-re-enlistment - An EM chooses not to reenlist or the
CG chooses not to reenlist the EM.

• Administrative reasons - An EM may depart the CG prior to
the end of his/her enlistment contract "for the
convenience of the government".

• Selection to an Officer program.

• Death or disability.


This study will concentrate on the first four reasons
listed above and will use the word "separation" to refer to
the EM leaving active duty service.


D. SCOPE OF THE THESIS 

The first goal of this study is to develop survival
functions for USCG enlisted personnel, using survival analysis
techniques to analyze data on the length of time
an EM remained in the CG. This technique takes on different
names, depending on the particular application at hand.

Recently the term "event history analysis" has been
used by social scientists to describe applications for the
analysis of the length of time it takes an employee to retire
or resign from a given job [Ref. 2].

Survival analysis is a method for describing the
distribution of the length of time until a given event, such as
the termination of service in the CG. One way to perform
survival analysis is to construct a histogram of the length
of time that individuals spend in service. Alternatively, one
can use the length of service time as a dependent variable and
determine if it can be predicted by variables such as marital
status, race, gender or military occupational skill.

The main tool used in the survival analysis is the
survival function, calculated from the data using the frequency
histogram in which, for any given time t, the area under the
curve to the left of t is the proportion of individuals in the
population who separate from the CG up to time t. A common
feature of survival data is the presence of right censored
observations due either to withdrawal of experimental units or
termination of the experiment. For such observations it is
only known that the lifetime exceeded the given value. The
exact lifetime remains unknown. Such data cannot be analyzed
by ignoring the censored observations because, among other
considerations, the longer-lived observations are generally
more likely to be censored. The survival distribution
function evaluated at t is the probability that an observation
from the population will have a lifetime exceeding t, that is
S(t) = Prob(T > t). Its estimators are called product-limit
estimators [Ref. 2].
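
The product-limit idea described above can be sketched in a few lines of Python. This is an illustrative sketch under the usual product-limit (Kaplan-Meier) convention, not the SAS code used in this thesis: at each distinct separation time the running survival estimate is multiplied by one minus the fraction of members at risk who separated, and censored members leave the risk set without contributing a separation. The function name and the sample data are hypothetical.

```python
def product_limit(durations, observed):
    """Product-limit estimate of S(t) from right-censored data.

    durations: months of service for each member.
    observed:  True if the separation was seen, False if right censored.
    Returns a list of (t, S(t)) pairs, one per distinct separation time.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        separations = 0
        leaving = 0
        # Group all members whose recorded duration equals t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                separations += 1
            leaving += 1
            i += 1
        if separations:
            survival *= 1.0 - separations / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical sample: five members, two of them right censored.
sample_curve = product_limit([1, 2, 2, 3, 4], [True, True, False, True, False])
```

In this sample the member censored at month 2 still counts as at risk at month 2, which is why ignoring censored records would bias the estimate.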

An analysis of the entire data set will be carried out
in order to identify specific trends, behaviors and the
predominance of individual characteristics in the data
under study.

A regression model will be developed to forecast monthly
attrition and to establish relationships between explanatory
variables and monthly attrition.


II. DATA OVERVIEW


A. POPULATION

The USCG Office of Personnel and Training, located in
Washington D.C., provided the data for this study. The data
set contained 272,160 individual personnel records for all CG
enlisted personnel for the observation period from FY83 to
FY90. The fields contained in each record are: Personnel
Identification Number (PID), a generic number which identifies
each individual in the CG; rating, according to the military
occupational skill; paygrade, from E-1 to E-9; sex, male
or female; minority designator, the race of the individual;
marital status, married or single; date of entrance into the
CG; date of separation from the CG; GAOCD, a separation
designator code for CG enlisted personnel; and censor, which
indicates whether the person remained in the CG or not. The
data contain a record for each enlisted member per year of
service in the CG; in other words, if a person remains in
service for 8 years there exist 8 records for that
individual. These multiple records were collapsed into one
record per person, without loss of information. After this
process, there were 50,036 records, of which 29,405 belonged
to people who left active duty service during the observation
period and 20,631 belonged to censored active duty CG
members at the end of FY90 (30 Sep/90). These data include
cases where the CG allowed a member to separate from the CG
and then rejoin in the future at the member's former paygrade.
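
The collapsing step can be sketched as follows. The thesis does not state how the yearly records were merged, so this Python sketch assumes that the most recent yearly record of each member carries everything needed (entrance date, separation date and censor flag). The field names `pid`, `fiscal_year` and `censor` are hypothetical stand-ins for the fields listed above.

```python
def collapse_records(rows):
    """Keep one record per member: the one with the latest fiscal year.

    rows: iterable of dicts with at least 'pid' and 'fiscal_year' keys
          (hypothetical field names).
    Returns a dict mapping pid -> retained record.
    """
    latest = {}
    for row in rows:
        kept = latest.get(row["pid"])
        if kept is None or row["fiscal_year"] > kept["fiscal_year"]:
            latest[row["pid"]] = row
    return latest


# Hypothetical yearly records for two members.
yearly = [
    {"pid": 1, "fiscal_year": 83, "censor": 1},
    {"pid": 1, "fiscal_year": 84, "censor": 1},
    {"pid": 1, "fiscal_year": 85, "censor": 0},
    {"pid": 2, "fiscal_year": 90, "censor": 1},
]
per_member = collapse_records(yearly)
```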


Figure 1. Enlisted attrition by month each year.


Based on this data set, the number of months a person spent
in the CG can be calculated in order to analyze total time in service.
In addition, censoring information can be obtained. Figure 1
shows the monthly attrition for each year from FY83 to FY90.
The attrition for Dec/83 appears to be an outlier, since the
attrition values for the remaining months
are no greater than 650. However, this value is preceded by
the two smallest attrition values in the sample (25 and 30),
both of which are below the next minimum value of attrition.

Figure 2. First quarter FY84 fixed.


In the subsequent analysis, each month of the first
quarter of FY84 is replaced by the average of the three attrition values
corresponding to Oct/83 (25), Nov/83 (30) and Dec/83 (1,339).
As can be seen in Figure 2, the new values for FY84 are within the
range of the values for all other fiscal years.
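
The adjustment of the first quarter of FY84 amounts to the following arithmetic, shown here as a short Python sketch. The three input values are those quoted above; the function name is hypothetical.

```python
def smooth_quarter(values):
    """Replace each monthly value in a quarter by the quarter's mean."""
    average = sum(values) / len(values)
    return [average] * len(values)


# Attrition for Oct/83, Nov/83 and Dec/83 as reported in the text.
first_quarter_fy84 = [25, 30, 1339]
smoothed = smooth_quarter(first_quarter_fy84)
# Each month becomes 1394 / 3, roughly 464.7 separations.
```

The quarterly total of 1,394 separations is preserved, while each month falls below the 650 ceiling observed for the other months.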


B. SUMMARY OF PERSONNEL 

Table 2 provides a comprehensive view of the stocks of
enlisted active duty personnel (AC) by paygrade at 30
Sep/90, and the total of enlisted members who separated (RE)
during the observation period from the start of FY83 to the end of
FY90. Of note is the fact that 22% of the total data is
comprised of members who are in the E-2 paygrade and that the
highest amount of attrition occurs in paygrade E-4.


Table 2. Active Duty Enlisted at 30 Sep/90 and Separated Enlisted Personnel from FY83 to FY90


PAYGRADE    ACTIVE DUTY    SEPARATED    TOTAL

1781
6894 10918
1833
2003 6255
TOTAL       20631          29405        50036


Table 3. Personnel by Sex and Marital Status


PAYGRADE    SEX (MALE | FEMALE)    MARITAL STATUS    TOTALS

E 1592 1666 1781
RE 4272 4778
Table 3 presents the stocks of personnel by sex and
marital status. As can be observed, males and single members
are the dominant groups in the CG; females have reached
paygrade E-8; married members (4,813), compared to single
members, left the CG in low paygrades (E-1 to E-4) in
lower proportion than in paygrades E-5 to E-9 (married
(4,387), single (3,679)); there is a small number of single
members on active duty in paygrades E-8 and E-9.


Table 4 contains the enlisted stocks classified by race.
The findings observed here are: Caucasians have the largest
stocks, followed by Blacks, Hispanics, American Indians and
Asians.


Table 4. Active Duty and Retired personnel by race


PAYGRADE | RACE TOTAL

During the eight years of observation American Indians
have reached only paygrade E-7. Blacks, Hispanics, American
Indians and Asians together make up only 13% of the population.


Table 5 classifies each paygrade by rating. The non-rated
enlisted (MOS 150 and 320) comprise 49% of the total
population. The rating with the greatest number of enlisted is
200 (Machinery Technician) with 9%, followed by rating 100
(Boatswain's Mate) with 6.6%. The least populated rating is
180 (Fire Control Technician) with 114 enlisted (0.2%).


Table 5. Enlisted personnel by MOS
III. SURVIVAL ANALYSIS

This chapter presents methods for further quantifying the
probability that an enlisted member will separate from the CG.
The time reference is established as the number of months an
enlisted member served in the CG. This number is computed
from the date on which the individual joined the CG and the
date on which he or she separated. Only integer months were
calculated, and any remaining days (fewer than 30) were neglected.
For example, an individual who served between 87/09/05 (year,
month, day) and 90/02/20 has a calculated value, named MY, of
29 months on active duty. With this information survival
functions were developed by paygrade and individual
characteristics such as sex, race, marital status and rating.
These calculations were performed using the SAS LIFETEST
procedure. A program listing is included as Appendix A.
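
The computation of the service time MY can be sketched in Python as follows. The exact rule used in the thesis is not given beyond "integer months, leftover days neglected", so this sketch assumes that only completed calendar months are counted. The function name is hypothetical, and the dates are those of the example above.

```python
from datetime import date


def months_served(entered, separated):
    """Whole months between two dates, discarding any leftover days."""
    months = (separated.year - entered.year) * 12
    months += separated.month - entered.month
    if separated.day < entered.day:
        # The final month was not completed, so it is not counted.
        months -= 1
    return months


# Example from the text: 87/09/05 to 90/02/20.
my = months_served(date(1987, 9, 5), date(1990, 2, 20))
```

Under this assumption the example yields MY = 29, which agrees with the value quoted in the text.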


A. SURVIVAL FUNCTIONS 

A first step in the analysis of the survival data is the
estimation of the distribution of the failure times. The
survival distribution function (SDF), also known as the
survival function, is used to describe the lifetimes of the
enlisted personnel in the CG. The SDF evaluated at t is the
probability that an enlisted member will have a lifetime
exceeding t, that is S(t) = Prob(T > t), where S(t) denotes the
survival function and T is the lifetime of a randomly selected
experimental unit. There are three types of plots commonly
used with survival functions: a plot of the estimated
survival function against time, a plot of the negative natural
log of the estimated survival function against time, and a
plot of the natural log of the negative natural log of the
estimated survival function against the natural log of time.
The last two plots provide empirical checks of the
appropriateness of the exponential model and the Weibull model,
respectively, for the survival data [Ref. 3]. The
most important task in the analysis of CG attrition data is
the comparison of survival curves. It is of interest to
determine whether two or more strata share the same survival
functions. The LIFETEST procedure can be used with data that
may be right censored (enlisted members still serving in the CG
on 30 Sep/90) to compute nonparametric estimates of the
survival distributions for each stratum and to perform rank
tests for the association of CG attrition with other
individual characteristics. The survival estimates are
computed within the defined strata levels. Rank statistics
and likelihood ratio tests are used to test homogeneity of the
survival function over different strata.
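
The two diagnostic plots mentioned above rest on simple transformations of the estimated survival function. Under an exponential model S(t) = exp(-λt), so -log S(t) is a straight line through the origin in t. Under a Weibull model S(t) = exp(-(t/b)^c), so log(-log S(t)) is a straight line in log t with slope c. The Python sketch below computes the transformed points. The function name and the synthetic curves are hypothetical, and plotting is left to any graphics package.

```python
import math


def diagnostic_points(curve):
    """Transform (t, S(t)) pairs for the exponential and Weibull checks.

    Returns two lists:
      neg_log:     (t, -log S(t))            -> linear through 0 if exponential
      log_neg_log: (log t, log(-log S(t)))   -> linear if Weibull
    Points with t <= 0 or S outside (0, 1) are skipped because the
    logarithms are undefined there.
    """
    usable = [(t, s) for t, s in curve if t > 0 and 0.0 < s < 1.0]
    neg_log = [(t, -math.log(s)) for t, s in usable]
    log_neg_log = [(math.log(t), math.log(-math.log(s))) for t, s in usable]
    return neg_log, log_neg_log


# Synthetic curves used only to illustrate the expected straight lines.
exponential_curve = [(t, math.exp(-0.1 * t)) for t in (1, 2, 5, 10)]
weibull_curve = [(t, math.exp(-((t / 10.0) ** 2))) for t in (2, 4, 8)]
```

For survival curves such as those reported below, which drop sharply near 48 and 240 months, neither transformed plot would be expected to be linear.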

In the following section, global survival functions are
computed for individual characteristics such as sex, marital
status, race, paygrade and rating.
1. Enlisted attrition by sex.


Figure 3. Enlisted attrition by sex.


As observed from the plot of the survival functions
in Figure 3, records for females are limited to a maximum of
180 months, reflecting how recently they started serving in the CG.
Females have demonstrated a higher attrition rate than their
counterparts. Comparatively higher attrition rates are found
close to 48 months of service for both groups, which
corresponds to the end of the first contract. The three tests
used for homogeneity indicate that the survival functions for
the two sexes are significantly different. The log survival
plot in Figure 3 does not present a linear pattern through
the origin, which is required to support an exponential model,
and the plot of the log of the negative log of the survival
function does not show linear behavior in support of a Weibull model.
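
One commonly used rank test for homogeneity of two survival functions is the log-rank test. The thesis does not list the three tests it relied on, so the Python sketch below is offered only as an illustration of how such a test works, not as a reproduction of the SAS output. At every distinct separation time it compares the separations observed in the first group with the number expected if both groups shared one survival function. The resulting statistic is compared with a chi-square distribution with one degree of freedom (critical value 3.84 at the 5% level). All names and sample data are hypothetical.

```python
def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-sample log-rank chi-square statistic (1 degree of freedom).

    times_*:  months of service; events_*: True if separation observed,
    False if right censored.
    """
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    diff = 0.0       # sum of observed minus expected separations, group A
    variance = 0.0   # hypergeometric variance accumulated over event times
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if e and x == t)
        d_b = sum(1 for x, e in zip(times_b, events_b) if e and x == t)
        n, d = n_a + n_b, d_a + d_b
        diff += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return diff * diff / variance if variance > 0 else 0.0


# Hypothetical groups with clearly different service times.
chi2 = logrank_chi2([1, 2, 3, 4, 5], [True] * 5,
                    [10, 11, 12, 13, 14], [True] * 5)
```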


2. Enlisted attrition by marital status.

Figure 4. Enlisted attrition by marital status.


Figure 4 shows the plot of the survival function
estimates, where the probability of surviving in the CG is
higher for a married person than for a single member. The
attrition is high in the starting period of service and
decreases rapidly for single members once they finish their
first four-year contract, as compared to that of married
members. This behavior is inverted when a member reaches 20
years of service. Tests for homogeneity again indicate that
the survival functions for single and married enlisted are
significantly different. The log plots
support neither exponential nor Weibull models for the data.


3. Enlisted attrition by race 


Figure 5. Enlisted attrition by race. (Panels: survival function estimates and log(-log(survival function)) estimates, plotted against months.)
The survival functions here show a similar pattern for
Blacks and Hispanics. American Indians have the highest
attrition. Asian groups show the highest survival
probability, followed by Caucasians. The log plots
conform neither to exponential nor Weibull models.


4. Enlisted attrition by paygrade. 


Figure 6. Enlisted attrition by paygrade.


The graph of the survival function estimates in Figure
6 shows the highest attrition in the low paygrades until the
end of the 4-year contract. The attrition for paygrades E-6
to E-9 increases abruptly after reaching 20 years of service.

The tests for homogeneity indicate that at least one of
the survival functions is different from the others.
However, in the plot of the negative log of the survival function against
time in Figure 6, paygrades E-1 to E-3 appear to be
candidates for exponential models, while paygrades E-7 to E-9
appear to follow a Weibull model. Note that the
general pattern is piecewise linear for each paygrade until
the two deep drops around 48 and 240 months.

5. Enlisted attrition by rating 

Figure 7 represents the attrition for each rating. There
are three survival functions that do not follow the
pattern of the others. They correspond to rating 170
(Gunner's Mate), followed in decreasing order by rating 280
(Telephone Technician) and rating 870 (Health Service
Technician). Rating 570 (Aviation Machinist's Mate) is
the group with the highest survival probability, followed by
rating 100 (Boatswain's Mate). All survival functions drop
sharply around months 48 and 240.

Figure 7. Enlisted attrition by rating.


B. SUMMARY

In this chapter, the survival functions were estimated in
terms of individual characteristics of the enlisted member.
Results obtained are as follows: males and married individuals
have higher survival probabilities than their counterparts,
respectively; paygrades E-1 to E-5 have higher attrition than
paygrades E-6 to E-9; American Indians have the highest
attrition and Asian members the highest survival
probabilities; rating 170 (Gunner's Mate) has the highest
attrition over all ratings, followed by rating 180 (Fire
Control Technician); the rating with the highest survival
probability is 570 (Aviation Machinist's Mate); there was a
decreasing trend in attrition in the last 4 years of the
observation period. The common feature of the survival
functions was the significant fall at the end of the
four-year service contract and when the enlisted member reaches
twenty years of service.

In general the survival functions developed in this thesis
have similar patterns for paygrade, marital status and
sex to those analyzed in the previous work by LT Blakemore.
For rating and race, this thesis employed more strata than
his and the results could not be compared.

It appears that neither the Weibull nor the exponential model
can be fitted to the survival data.
IV. PREDICTION MODEL FOR MONTHLY ATTRITION
This chapter deals with the study of multiple regression
models to predict monthly attrition. Monthly attrition is the
dependent variable. The potential independent variables for
predicting monthly attrition are: monthly attrition in the
three previous months, the number of enlisted personnel who
joined the CG four years ago and twenty years ago,
respectively, seasonality, the monthly unemployment rate and time.


A. MULTIPLE REGRESSION MODEL

The reason for using regression models is the desire to
forecast. A forecast is a quantitative estimate (or set of
estimates) about the likelihood of future events based on past
and current information. Two types of forecasting can be
applied: the point forecast, which predicts a single number in
each forecast period; and the interval forecast, which
indicates an interval in which the realized value will lie.

The variables in the following list were used as a set of
candidate predictors to forecast monthly attrition (which will
be denoted CONTEO).


EXPLANATORY VARIABLES:


YT_1     number of attrition in the previous month
YT_2     number of attrition two months ago
YT_3     number of attrition three months ago
XT_48    number of enlisted who entered four years ago
XT_240   number of enlisted who entered twenty years ago
MM       time period starting from Oct/82
TS       square of the time period
D1       dummy variable indicating first semester of each FY
         (Oct - Mar)
D2       dummy variable indicating second semester of each
         FY (Apr - Sep)
RATA     monthly unemployment rate of U.S.

XT_48 and XT_240 were included because of their prominence
in the survival analysis. These two time periods (48 and 240 months)
showed a significant change in the survival probabilities.
The variables MM and TS were used to capture the
time trend in attrition behavior over the observation period.
Two dummy variables, D1 and D2, were included to distinguish
the potential difference in attrition behavior between the two
semesters. Finally, the monthly unemployment rate of the
U.S. was added to relate the attrition behavior to the
condition of the U.S. economy. These rates were taken from the
Monthly Labor Review, a publication of the US Department of
Labor, Bureau of Labor Statistics. Based on these, the
initial model considered was as follows:


CONTEO = β1·YT_1 + β2·YT_2 + β3·YT_3 + β4·XT_48 + β5·XT_240 + β6·MM + β7·TS + β8·D1 + β9·D2 + β10·RATA


Note that this model does not contain an intercept term. 
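
A regression without an intercept term can be sketched in Python as follows. This is an illustration of the technique only, not the SAS PROC REG run used in the thesis. The first function builds the lagged predictor YT_1 from the attrition series, and the second solves the normal equations (X'X)b = X'y by Gauss-Jordan elimination with no column of ones, which is what forces the fitted surface through the origin. The function names and the synthetic series are hypothetical.

```python
def lagged_design(conteo, xt_48, rata):
    """Rows [YT_1, XT_48, RATA] and targets CONTEO for months 1..n-1."""
    rows = [[conteo[t - 1], xt_48[t], rata[t]] for t in range(1, len(conteo))]
    return rows, conteo[1:]


def fit_no_intercept(rows, y):
    """Least squares coefficients for y = b1*x1 + ... + bk*xk (no intercept)."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if abs(a[col][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Synthetic series generated from known coefficients (0.5, 0.2, 5.0).
xt_48 = [400, 420, 380, 450, 410, 395, 430, 405]
rata = [7.0, 6.5, 7.8, 6.9, 7.4, 6.6, 7.1, 7.6]
conteo = [300.0]
for t in range(1, len(xt_48)):
    conteo.append(0.5 * conteo[t - 1] + 0.2 * xt_48[t] + 5.0 * rata[t])

rows, targets = lagged_design(conteo, xt_48, rata)
coefficients = fit_no_intercept(rows, targets)
```

Since the synthetic series contains no noise, the fit should recover the generating coefficients. With real attrition data the same steps yield the least squares estimates, which backward elimination then prunes.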


Using the backward elimination option of the REG procedure
of the statistical package SAS, the following three models were
selected to predict the monthly attrition.


MODEL 1    CONTEO = YT_1 XT_48 MM TS RATA


MODEL 2    CONTEO = YT_1 XT_48 RATA


MODEL 3    CONTEO = YT_1 YT_2


Details are in Appendix B. 


Figure 8. Validation of model 2 and current CG method. (Panels: monthly attrition predicted with the selected model and monthly attrition predicted with the USCG current method, each showing actual and predicted values.)


The predictor variables in the models are significant at
the 5% level, and the R² for all models is greater than 0.9569.
To fit the models, 96 observations (one for
each month in eight fiscal years from Oct/82 to Sep/90) were
used, while 33 observations (Oct/90 to Jun/93) were employed to
validate the models. As performance criteria, the mean squared
error (MSE) and mean relative error (MRE) were used.
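
The two performance criteria can be written out as a short Python sketch. It assumes the definitions suggested by the listing in Appendix B: the squared error (CONTEO - prediction)² and the relative error |CONTEO - prediction| / CONTEO, each averaged over the validation months. The function names and sample values are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual, predicted):
    """Average of |actual - predicted| / actual (actual must be nonzero)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical two-month validation sample.
actual = [100, 200]
predicted = [110, 180]
mse = mean_squared_error(actual, predicted)
mre = mean_relative_error(actual, predicted)
```

The MSE penalizes large misses heavily, while the MRE expresses the typical miss as a fraction of the true monthly attrition, so the two criteria can rank models differently.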

The predicted attrition values for each true monthly
attrition for FY91, FY92 and the first nine months of FY93
were computed and are in Appendix C. Figure 8 contains the
corresponding plots. The mean squared error and the mean
relative error of each model are given as follows:


Mean Squared Error and Mean Relative Error for FY91

        Model 1        Model 2        Model 3        CG Method
MSE     1410.165       [illegible]    817.225        [illegible]
MRE     [illegible]    0.13904        [illegible]    [illegible]


Mean Squared Error and Mean Relative Error for FY92

        Model 1        Model 2        Model 3        CG Method
MSE     [illegible]    94.482         [illegible]    [illegible]
MRE     [illegible]    0.08839        0.10979        [illegible]


Mean Squared Error and Mean Relative Error for FY93 (9 months)

        Model 1        Model 2        Model 3        CG Method
MSE     2304.833       505.083        533.362        3586.590
MRE     0.52702        0.27350        0.28616        1.86012


B. RESULTS 

The results shown in the previous section indicate that
model 2 best predicts the twelve-month attrition for FY91 and
FY92 and the first nine months of FY93 in terms of not only
the MSE but also the MRE. In model 2, almost 97% of the total
variation in the attrition is explained by the predictor
variables XT_48, YT_1 and RATA. The correlations among
these three predictors are low, and the Durbin-Watson statistic
is close to two, indicating that there is no significant first order
autocorrelation in the residuals.
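
The Durbin-Watson statistic referred to above is the sum of squared differences between successive residuals divided by the sum of squared residuals. The following Python sketch shows the computation. The function name and the residual values are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals: perfectly alternating signs give a value of 3.0
# for four points, while identical residuals give 0.0.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

Values near 0 point to positive first order autocorrelation, values near 4 to negative autocorrelation, and values near 2 to none, which is the case reported for model 2.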

In summary, in order to forecast a future monthly
attrition one can use:

CONTEO = 0.674663*YT_1 + 0.170029*XT_48 + 6.996215*RATA

When YT_1 and RATA are unknown at the time of the forecast,
predicted values of YT_1 and RATA can replace the actual YT_1 and
RATA.
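
The use of the fitted equation can be sketched in Python as follows. The coefficients are those of model 2 given above. The multi-month function illustrates the substitution just described by feeding each forecast back in as the next month's YT_1. The function names and the input values are hypothetical, and the unemployment rates passed in would themselves have to be forecasts.

```python
def forecast_month(yt_1, xt_48, rata):
    """One-month-ahead attrition forecast from model 2."""
    return 0.674663 * yt_1 + 0.170029 * xt_48 + 6.996215 * rata


def forecast_horizon(last_attrition, xt_48_series, rata_series):
    """Forecast several months ahead, reusing each forecast as the next YT_1.

    last_attrition: most recent observed monthly attrition.
    xt_48_series:   enlistments four years before each forecast month.
    rata_series:    unemployment rate (actual or predicted) for each month.
    """
    forecasts = []
    previous = last_attrition
    for xt_48, rata in zip(xt_48_series, rata_series):
        previous = forecast_month(previous, xt_48, rata)
        forecasts.append(previous)
    return forecasts


# Hypothetical inputs for a two-month horizon.
two_months = forecast_horizon(100, [100, 100], [1.0, 1.0])
```

XT_48 is always known at forecast time because it refers to enlistments recorded four years earlier, so only YT_1 and RATA ever need to be replaced by predictions.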
V. CONCLUSION AND RECOMMENDATIONS


A. CONCLUSIONS

The objective of this thesis was to analyze the attrition
behavior and to predict monthly enlisted attrition of the US
Coast Guard.

First, survival analysis was used to investigate the
attrition behavior of USCG enlisted members in terms of individual
characteristics such as sex, marital status, rating, race and
paygrade. Results obtained based on the past eight years of
data (Oct/82 to Sep/90) are as follows: males and married
individuals have higher survival probabilities than their
counterparts, respectively; paygrades E-1 to E-5 have higher
attrition than paygrades E-6 to E-9; American Indians have
the highest attrition and Asian members the highest
survival probabilities; rating 170 (Gunner's Mate) has the
highest attrition over all ratings, followed by rating 180
(Fire Control Technician); the rating with the highest
survival probability is 570 (Aviation Machinist's Mate);
there was a decreasing trend in attrition in the last 4 years
of the observation period. The common feature of the survival
functions was the significant fall at the end of the
four-year service contract and when the enlisted member reaches
twenty years of service.

Secondly, a regression model was estimated to forecast the
monthly attrition. The significant predictors are the prior
month's attrition, the number of enlistments four years prior
and the current unemployment rate.
This model explains almost 97% of the variation in the monthly
attrition. The performance of the regression model turns out to
be better than that of the current method used in the CG.


B. RECOMMENDATIONS 
The goals of this thesis were met. Recommendations for further
studies in CG enlisted attrition are listed below:
1. Recommendation 1 
For future research it will be necessary to have a 
more manageable enlisted data base in order to facilitate 
necessary analysis. 
2. Recommendation 2 
It is recommended to formulate the CG enlisted 
attrition as a time series model such as moving average 
models, autoregressive models or a combination of both. 
3. Recommendation 3 
It is recommended that the forecast model be 


formulated for each rating. 
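
To make Recommendation 2 concrete, the following is a minimal sketch of the simplest autoregressive formulation, an AR(1) model y_t = c + phi * y_(t-1) + e_t fitted by ordinary least squares. It is only an illustration of the model class; the function names and the sample series are hypothetical, and a real study would also consider moving average terms and seasonal effects.

```python
def fit_ar1(series):
    """Least-squares estimates (c, phi) for y_t = c + phi * y_(t-1) + e_t."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(c, phi, last_value, steps):
    """Iterate the fitted equation forward for the given number of months."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Hypothetical noise-free series generated by y_t = 10 + 0.5 * y_(t-1).
series = [100.0, 60.0, 40.0, 30.0, 25.0, 22.5]
c, phi = fit_ar1(series)
next_two = forecast_ar1(c, phi, series[-1], 2)
print(c, phi, next_two)
```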
APPENDIX A 
This source file was used to calculate survival functions 
for each individual characteristic such as sex, marital 
status, race, paygrade and rating. 
OPTIONS LINESIZE=80; 
DATA ONE; SET [illegible]; 
CENSORED = (DEPART [illegible]); 
IF [illegible] THEN GROUP = [illegible]; 
ELSE IF [illegible] THEN GROUP = [illegible]; 
ELSE IF [illegible] THEN GROUP = [illegible]; 
ELSE IF [illegible] THEN GROUP = [illegible]; 
ELSE IF [illegible] THEN GROUP = [illegible]; 
ELSE IF [illegible] THEN GROUP = [illegible]; 
ELSE IF [illegible] THEN GROUP = [illegible]; 
ELSE IF [illegible] THEN GROUP = [illegible]; 
ELSE GROUP = [illegible]; 
PROC LIFETEST OUTSURV=[illegible] PLOTS=[illegible]; 
TIME [illegible]*CENSORED([illegible]); 
STRATA GROUP ; 
DATA DOS ; SET [illegible]; 
DROP GROUP [illegible]; 
IF _CENSOR_ = [illegible]; 
DATA TRES ; SET DOS (DROP=_CENSOR_); 
[illegible] 
APPENDIX B 
This is the source file to estimate and validate the 3 
regression models. 
OPTIONS LINESIZE=80; 
DATA ONE; 
INPUT MM CONTEO XT_48 RATA D1 D2 ; 
CARDS; 
DATA DOS; SET ONE; 
YT_1 = LAG1(CONTEO) ; 
YT_2 = LAG2(CONTEO) ; 
TS = MM * MM ; 
IF MM < 97 ; 
PROC CORR; 
PROC REG ; 
MODEL CONTEO = YT_1 YT_2 XT_48 MM TS RATA / NOINT DW ; 
PROC REG ; 
MODEL CONTEO = YT_1 XT_48 MM TS RATA / NOINT DW ; 
PROC REG; 
MODEL CONTEO = YT_1 XT_48 MM RATA / NOINT DW ; 
PROC REG ; 
MODEL CONTEO = YT_1 XT_48 RATA / NOINT DW ; 
PROC REG; 
MODEL CONTEO = YT_1 YT_2 / NOINT DW ; 

DATA NEW ; SET [illegible] ; 
DROP [illegible] ; 
IF MM [illegible] ; 
NEWM1 = 0.558604*YT_1 + 0.162241*XT_48 + 9.085655*RATA + 
        2.094146*MM - [illegible]*TS ; 
SQ1 = (CONTEO - NEWM1)**2 ; 
RE1 = ABS(CONTEO - NEWM1) / CONTEO ; 
NEWM2 = 0.674663*YT_1 + 0.170029*XT_48 + 6.996215*RATA ; 
SQ2 = (CONTEO - NEWM2)**2 ; 
RE2 = ABS(CONTEO - NEWM2) / CONTEO ; 
NEWM4 = 0.795277*YT_1 [illegible] ; 
SQ4 = (CONTEO - NEWM4)**2 ; 
RE4 = ABS(CONTEO - NEWM4) / CONTEO ; 
DATA NEW1 ; SET NEW ; 
DROP MM YT_1 XT_48 TS D1 D2 RATA ; 
PROC SUMMARY ; VAR SQ1 SQ2 SQ4 RE1 RE2 RE4 ; 
OUTPUT OUT = OUT1 MEAN = MSQ1 MSQ2 MSQ4 MRE1 MRE2 MRE4 ; 
PROC PRINT DATA = OUT1 ; 
PROC PRINT DATA = NEW1 ; 
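
The validation step at the end of this listing averages two error measures per model (the SQ and RE variables passed to PROC SUMMARY). The sketch below shows one plausible reading of them, a mean squared error and a mean relative error, written in Python rather than SAS. The exact formulas are partly unreadable in the source, so the definitions here are assumptions, and the sample numbers are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average of (actual - predicted)^2 over the validation months."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual, predicted):
    """Average of |actual - predicted| / actual over the validation months."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical validation data: true monthly attrition and one model's forecasts.
true_attrition = [400, 300, 250]
model_forecast = [380, 330, 250]

msq = mean_squared_error(true_attrition, model_forecast)
mre = mean_relative_error(true_attrition, model_forecast)
print(f"MSQ = {msq:.3f}, MRE = {mre:.3f}")
```

Under this reading, the model with the smaller mean values on the held-out months is the one preferred for forecasting.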
Correlation Analysis 

9 'VAR' Variables: MM CONTEO XT_48 RATA D1 D2 YT_1 YT_2 TS 

Simple Statistics 

Variable    N        Mean      Std Dev         Sum     Minimum     Maximum 
MM         96    48.50000     27.85678        4656     1.00000    96.00000 
CONTEO     96   306.29167    106.72453       29404   120.00000   546.00000 
XT_48      96   303.02083  [illegible]       29090    38.00000   649.00000 
RATA       96     6.87917      1.56937   660.40000     5.00000    10.90000 
D1         96     0.51042      0.50252    49.00000           0     1.00000 
D2         96           0            0           0           0           0 
YT_1       95   307.90526    106.10694       29251   120.00000   546.00000 
YT_2       94   309.31915    105.82235       29076   120.00000   546.00000 
TS         96        3120         2789      299536     1.00000        9216 
Correlation Analysis 

Pearson Correlation Coefficients / Prob > |R| under Ho: Rho=0 / Number of 
Observations 

                   MM       CONTEO        XT_48         RATA           D1 

MM            1.00000  [illegible]  [illegible]  [illegible]  [illegible] 
                  0.0       0.0001       0.0001       0.0001  [illegible] 
                   96           96           96           96           96 

CONTEO    [illegible]      1.00000      0.53645      0.44194  [illegible] 
               0.0001          0.0       0.0001       0.0001  [illegible] 
                   96           96           96           96           96 

XT_48     [illegible]      0.53645      1.00000      0.32342     -0.14932 
               0.0001       0.0001          0.0       0.0013       0.1465 
                   96           96           96           96           96 

RATA      [illegible]      0.44194      0.32342      1.00000      0.12041 
               0.0001       0.0001       0.0013          0.0       0.2426 
                   96           96           96           96           96 

D1        [illegible]  [illegible]     -0.14932      0.12041      1.00000 
          [illegible]  [illegible]       0.1465       0.2426          0.0 
                   96           96           96           96           96 

YT_1      [illegible]  [illegible]      0.43099      0.42469     -0.26260 
               0.0001       0.0001       0.0001       0.0001       0.0101 
                   95           95           95           95           95 

YT_2         -0.59948      0.69724      0.42722      0.39914     -0.20205 
               0.0001       0.0001       0.0001       0.0001       0.0509 
                   94           94           94           94           94 

TS            0.96888  [illegible]  [illegible]  [illegible]     -0.09517 
               0.0001       0.0001       0.0001       0.0001       0.3563 
                   96           96           96           96           96 

                   D2         YT_1         YT_2           TS 

MM                  .  [illegible]     -0.59948      0.96888 
                    .       0.0001       0.0001       0.0001 
                   96           95           94           96 

CONTEO              .  [illegible]      0.69724  [illegible] 
                    .       0.0001       0.0001       0.0001 
                   96           95           94           96 

XT_48               .      0.43099      0.42722  [illegible] 
                    .       0.0001       0.0001       0.0001 
                   96           95           94           96 

RATA                .      0.42469      0.39914  [illegible] 
                    .       0.0001       0.0001       0.0001 
                   96           95           94           96 

D1                  .     -0.26260     -0.20205     -0.09517 
                    .       0.0101       0.0509       0.3563 
                   96           95           94           96 

YT_1                .      1.00000      0.80644     -0.68260 
                    .          0.0       0.0001       0.0001 
                   95           95           94           95 

YT_2                .      0.80644      1.00000     -0.67642 
                    .       0.0001          0.0       0.0001 
                   94           94           94           94 

TS                  .     -0.68260     -0.67642      1.00000 
                    .       0.0001       0.0001          0.0 
                   96           95           94           96 




Model: MODEL CONTEO = f(XT_48, RATA, YT_1, YT_2, TS, MM) 
NOTE: No intercept in model. R-square is redefined. 
Dependent Variable: CONTEO 

Analysis of Variance 

                     Sum of           Mean 
Source     DF       Squares         Square    F Value    Prob>F 
Model       6  9611402.0244   1601900.3374    485.667    0.0001 
Error      88  290254.97562     3298.35200 
U Total    94       9901657 

Root MSE    57.43128     R-square    0.9707 
Dep Mean   306.35106     Adj R-sq    0.9687 
C.V.        18.74688 

Parameter Estimates 

                 Parameter     Standard    T for H0: 
Variable   DF     Estimate        Error  Parameter=0   Prob > |T| 
YT_1        1     0.569087  [illegible]        5.615       0.0001 
YT_2        1    -0.009225   0.10036782       -0.092       0.9270 
XT_48       1  [illegible]   0.05108531  [illegible]  [illegible] 
MM          1     2.209609   0.97305540        2.271  [illegible] 
TS          1    -0.026221   0.01081079       -2.425       0.0173 
RATA        1     8.529324   3.38234431        2.522  [illegible] 

Durbin-Watson D              1.961 
(For Number of Obs.)            94 
1st Order Autocorrelation    0.009 
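
The Durbin-Watson D reported with each fit measures first-order autocorrelation in the residuals. A minimal sketch of the standard definition follows; the function name and the toy residual series are hypothetical. D is roughly 2(1 - r), where r is the first-order autocorrelation, so values near 2 (as in the listing above) indicate little autocorrelation.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residual series.
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # strong negative autocorrelation
constant = durbin_watson([2.0, 2.0, 2.0])            # strong positive autocorrelation
print(alternating, constant)
```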
Model: MODEL CONTEO = f(XT_48, RATA, YT_1, TS, MM) 
NOTE: No intercept in model. R-square is redefined. 
Dependent Variable: CONTEO 

Analysis of Variance 

                     Sum of           Mean 
Source     DF       Squares         Square    F Value    Prob>F 
Model       5  9723932.3348    1944786.467    599.485    0.0001 
Error      90  291968.66517     3244.09628 
U Total    95      10015901 

Root MSE    56.95697     R-square    0.9708 
Dep Mean   306.68421     Adj R-sq    0.9692 
C.V.        18.57186 

Parameter Estimates 

                 Parameter     Standard    T for H0: 
Variable   DF     Estimate        Error  Parameter=0   Prob > |T| 
YT_1        1     0.558604  [illegible]        7.026       0.0001 
XT_48       1     0.162241   0.05018257        3.233  [illegible] 
MM          1     2.094146   0.90905311        2.304       0.0235 
TS          1  [illegible]  [illegible]       -2.477       0.0151 
RATA        1  [illegible]  [illegible]        2.898  [illegible] 

Durbin-Watson D              1.979 
(For Number of Obs.)            95 
1st Order Autocorrelation    0.007 
Model: MODEL CONTEO = f(XT_48, RATA, YT_1, MM) 
NOTE: No intercept in model. R-square is redefined. 
Dependent Variable: CONTEO 

Analysis of Variance 

                     Sum of           Mean 
Source     DF       Squares         Square      F Value    Prob>F 
Model       4  9704026.5996   2426006.6499  [illegible]    0.0001 
Error      91   [illegible]    [illegible] 
U Total    95      10015901 

Root MSE  [illegible]    R-square    0.9689 
Dep Mean    306.68421    Adj R-sq    0.9675 
C.V.         19.08876 

Parameter Estimates 

                 Parameter     Standard    T for H0: 
Variable   DF     Estimate        Error  Parameter=0   Prob > |T| 
YT_1        1  [illegible]   0.06576463       10.272       0.0001 
XT_48       1     0.173063   0.05138446        3.368  [illegible] 
MM          1  [illegible]  [illegible]       -0.750       0.4550 
RATA        1     7.569406  [illegible]        2.395       0.0187 

Durbin-Watson D              2.087 
(For Number of Obs.)            95 
1st Order Autocorrelation   -0.050 


Model: MODEL CONTEO = f(XT_48, RATA, YT_1) 
NOTE: No intercept in model. R-square is redefined. 
Dependent Variable: CONTEO 

Analysis of Variance 

                     Sum of           Mean 
Source     DF       Squares         Square    F Value    Prob>F 
Model       3  9702097.4091   3234032.4697    948.144    0.0001 
Error      92  313803.59101     3410.90860 
U Total    95      10015901 

Root MSE    58.40298     R-square    0.9687 
Dep Mean   306.68421     Adj R-sq    0.9676 
C.V.        19.04336 

Parameter Estimates 

                 Parameter     Standard    T for H0: 
Variable   DF     Estimate        Error  Parameter=0   Prob > |T| 
YT_1        1     0.674663  [illegible]  [illegible]       0.0001 
XT_48       1     0.170029   0.05110352        3.327  [illegible] 
RATA        1     6.996215   3.05971911        2.287       0.0245 

Durbin-Watson D              [illegible] 
(For Number of Obs.)            95 
1st Order Autocorrelation    0.014 
Model: MODEL CONTEO = f(YT_1, YT_2) 
NOTE: No intercept in model. R-square is redefined. 
Dependent Variable: CONTEO 

Analysis of Variance 

                     Sum of           Mean 
Source     DF       Squares         Square    F Value    Prob>F 
Model       2  9521013.4894   4760506.7447   1150.595    0.0001 
Error      92  380643.51056     4137.42946 
U Total    94       9901657 

Root MSE    64.32285     R-square    0.9616 
Dep Mean   306.35106     Adj R-sq    0.9607 
C.V.        20.99645 

Parameter Estimates 

                 Parameter     Standard    T for H0: 
Variable   DF     Estimate        Error  Parameter=0   Prob > |T| 
YT_1        1     0.795277  [illegible]  [illegible]       0.0001 
YT_2        1     0.193608   0.10135320        1.910       0.0592 

Durbin-Watson D              2.003 
(For Number of Obs.)            94 
1st Order Autocorrelation   -0.012 
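
Each listing carries the note "No intercept in model. R-square is redefined." A short sketch of what that redefinition usually means follows: R-square is taken relative to the uncorrected total sum of squares, and the adjustment uses n rather than n - 1. The function names are hypothetical, and the formulas are the conventional ones for a no-intercept fit, assumed here rather than quoted from the source. The inputs are the error and uncorrected total sums of squares from the five-predictor listing (95 observations).

```python
def uncentered_r_square(ss_error, ss_uncorrected_total):
    """R-square for a no-intercept fit: 1 - SSE / (sum of y squared)."""
    return 1.0 - ss_error / ss_uncorrected_total


def adjusted_r_square_no_intercept(r_square, n_obs, n_params):
    """Adjusted R-square when the model has no intercept term."""
    return 1.0 - (1.0 - r_square) * n_obs / (n_obs - n_params)


# Five-predictor model: Error SS = 291968.66517, U Total SS = 10015901.
r2 = uncentered_r_square(291968.66517, 10015901)
adj_r2 = adjusted_r_square_no_intercept(r2, n_obs=95, n_params=5)
print(round(r2, 4), round(adj_r2, 4))
```

These reproduce the R-square of 0.9708 and adjusted R-square of 0.9692 printed for that model. Because the baseline is zero rather than the mean of the series, such values are not directly comparable with the R-square of a model that includes an intercept.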


APPENDIX C 

True and fitted attrition calculated with the selected model and 
the current method used by the USCG. 

               True       Predicted       Predicted 
             Attrition    Attrition       Attrition 
                           Model 2      Current Method 

FY91 

Oct/90          452         379.454      [illegible] 
Nov/90          366     [illegible]      [illegible] 
Dec/90      [illegible]     266.937      [illegible] 
Jan/91          306         296.000      [illegible] 
Feb/91          280     [illegible]      [illegible] 
Mar/91      [illegible] [illegible]      [illegible] 
Apr/91      [illegible]     334.110      [illegible] 
May/91      [illegible] [illegible]      [illegible] 
Jun/91          299     [illegible]      [illegible] 
Jul/91          383     [illegible]      [illegible] 
Aug/91          446     [illegible]      [illegible] 
Sep/91          419     [illegible]      [illegible] 

FY92 

Oct/91          391     [illegible]      [illegible] 
Nov/91          319     [illegible]      [illegible] 
Dec/91          293     [illegible]      [illegible] 
Jan/92          338     [illegible]      [illegible] 
Feb/92          262     [illegible]      [illegible] 
Mar/92          296     [illegible]      [illegible] 
Apr/92          305     [illegible]          304.000 
May/92          283     [illegible]          340.500 
Jun/92      [illegible]     290.604      [illegible] 
Jul/92          339     [illegible]      [illegible] 
Aug/92      [illegible] [illegible]      [illegible] 
Sep/92          318         291.120      [illegible] 

FY93 (nine months) 

Oct/92          292     [illegible]      [illegible] 
Nov/92      [illegible] [illegible]      [illegible] 
Dec/92          308         284.484      [illegible] 
Jan/93      [illegible] [illegible]      [illegible] 
Feb/93      [illegible]     256.763      [illegible] 
Mar/93      [illegible] [illegible]      [illegible] 
Apr/93          244     [illegible]      [illegible] 
May/93      [illegible] [illegible]      [illegible] 
Jun/93          199     [illegible]      [illegible] 
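
The "Current Method" column is described in the Executive Summary as the mean of the past eight years' attrition figures for the corresponding month. A minimal sketch of that rule follows; the function name, the dictionary layout and the sample counts are hypothetical.

```python
def current_method_forecast(history, year, month, window=8):
    """Mean attrition of the same calendar month over the previous years.

    history maps (year, month) to the attrition count for that month.
    """
    past = [history[(year - k, month)] for k in range(1, window + 1)]
    return sum(past) / len(past)


# Hypothetical October counts for 1982 through 1989.
october_counts = [320, 300, 310, 290, 280, 300, 310, 290]
history = {(1982 + i, 10): count for i, count in enumerate(october_counts)}

october_1990 = current_method_forecast(history, 1990, 10)
print(october_1990)
```

Because the rule only averages past values for the same month, it cannot react to a recent change in attrition, enlistments or unemployment, which is where the regression model has room to improve on it.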